Audio conversation device, method, and robot device

ABSTRACT

In a conventional voice dialogue system, it is sometimes difficult to carry on a natural dialogue with the user. The system therefore performs speech recognition on the user's utterance, controls a dialogue with the user according to a previously given scenario based on the speech recognition result, generates an answering sentence corresponding to the contents of the user's utterance as the occasion demands, and performs voice synthesis processing on one sentence of the reproduced scenario or on the generated answering sentence.

TECHNICAL FIELD

The present invention relates to a voice dialogue system, a voice dialogue method, and a robot apparatus, and is suitable for entertainment robots, for example.

BACKGROUND ART

Dialogues that voice dialogue systems carry on with human beings by voice can be classified into two types according to their contents: “dialogue having no scenario” and “dialogue having scenario”.

Of these, the “dialogue having no scenario” method is a dialogue method called “artificial unintelligence”, which is realized by a simple answering-sentence generation algorithm typified by Eliza (see Non-patent Document 1).

In the “dialogue having no scenario” method, as shown in FIG. 36, processing repeats the following procedure: when the user utters some words, the voice dialogue system performs speech recognition on them (step SP90), generates an answering sentence according to the recognition result and emits it as sound (step SP91), and then returns to await the next utterance (step SP92).
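By way of illustration only, this loop can be sketched in Python as follows; the function names (recognize_speech, generate_answer, speak) are hypothetical placeholders for the processing of steps SP90-SP92 and do not come from any actual Eliza implementation.

    # Minimal sketch of the "dialogue having no scenario" loop (FIG. 36).
    # All function names are illustrative placeholders.

    def recognize_speech() -> str:
        """Stand-in for step SP90: return the user's utterance as text."""
        return input("user> ")

    def generate_answer(utterance: str) -> str:
        """Stand-in for step SP91: a trivially simple answering rule."""
        return f"Why do you say '{utterance}'?"

    def speak(sentence: str) -> None:
        """Stand-in for emitting the answering sentence as sound."""
        print("system> " + sentence)

    while True:
        utterance = recognize_speech()      # step SP90
        if not utterance:                   # if the user falls silent, the
            break                           # dialogue stops progressing
        speak(generate_answer(utterance))   # step SP91, then repeat (SP92)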

A problem with this “dialogue having no scenario” method is that the dialogue does not progress unless the user utters. For example, if the response generated in step SP91 of FIG. 36 urges the user to the next utterance, the dialogue progresses; if it does not, for example if the user falls into a state of “not knowing what to say next”, the voice dialogue system simply continues to await the user's utterance and the dialogue does not progress.

Furthermore, since the “dialogue having no scenario” method has no scenario, there is also the problem that it is difficult to generate an answering sentence that takes the flow of the dialogue into account when generating a response in step SP91 of FIG. 36. For instance, it is difficult for the voice dialogue system to hear out the user's profile and then reflect it in the subsequent dialogue.

On the other hand, “dialogue having scenario” is a dialogue method in which the dialogue progresses by the voice dialogue system uttering sequentially according to a predetermined scenario; it proceeds as a combination of turns in which the voice dialogue system utters one-sidedly and turns in which the voice dialogue system questions the user and then responds to the user's answer. Note that a “turn” means a clearly independent utterance in a dialogue, that is, one unit of a dialogue.

With this dialogue method, the user needs only to answer the questions, so the user is never at a loss for what to utter. Furthermore, since the user's utterance can be constrained by the contents of the questions, designing the answering sentences for the turn in which the voice dialogue system responds to the user's answer is comparatively easy. For example, for a question from the voice dialogue system to the user in such a turn, it suffices to prepare only two types of responses, one for “yes” and one for “no”. There is also the advantage that the voice dialogue system can generate answering sentences that make use of the flow of the story.

Non-patent Document 1: “Artificial Unintelligence Review”, [online], [searched on Mar. 14, 2003 (Heisei 15)], Internet <URL: http://www.ycf.nanet.co.jp/-skato/muno/review.htm>

However, this dialogue method also has problems. First, since the voice dialogue system can only give utterances according to a scenario designed in advance by assuming the contents of the user's answers, it cannot respond when the user utters unexpected words.

For example, to a question that can be answered by “yes/no”, if the user replies that both are okay, that he has never thought about such a thing, or the like, the voice dialogue system cannot make any response; even if it responds, the response can only be extremely unsuitable as a response to the user's answer. Furthermore, in such a case, there is a high possibility that the story becomes unnatural afterwards.

Second, it is difficult to set the appearance ratio between the turns in which the voice dialogue system utters one-sidedly and the turns in which the voice dialogue system questions the user and then responds to the user's answer.

Practically, in such a voice dialogue system, if the former turns are too frequent, the user gets the impression that the voice dialogue system is uttering one-sidedly, and does not feel that he or she is “making a dialogue”. Conversely, if the latter turns are too frequent, the user feels as if answering a questionnaire or an interrogation; in this case too, the user does not feel that he or she is “making a dialogue”.

Accordingly, if these problems of conventional voice dialogue systems can be solved, a voice dialogue system can make a natural dialogue with the user, and its practicability and entertainment ability can be remarkably improved.

DESCRIPTION OF THE INVENTION

The present invention has been made in consideration of the above points, and provides a voice dialogue system, a voice dialogue method, and a robot apparatus that can perform a natural dialogue with the user.

To solve the above problems, according to the present invention, the voice dialogue system is provided with dialogue control means for controlling a dialogue with the user according to a previously given scenario, based on a speech recognition result from speech recognition means that performs speech recognition on the user's utterance, and response generating means for generating an answering sentence corresponding to the contents of the user's utterance in response to a request from the dialogue control means. The dialogue control means requests the response generating means to generate an answering sentence as the occasion demands, based on the contents of the user's utterance.

Consequently, in this voice dialogue system, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”.

Furthermore, according to the present invention, a voice dialogue method is provided with a first step of performing speech recognition on the user's utterance, a second step of controlling a dialogue with the user according to a previously given scenario based on the speech recognition result and, if needed, generating an answering sentence corresponding to the contents of the user's utterance, and a third step of performing speech synthesis processing on one sentence of the reproduced scenario or on the generated answering sentence. In the second step, an answering sentence corresponding to the contents of the user's utterance is generated as the occasion demands, based on the contents of the user's utterance.

Consequently, by this voice dialogue method, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”.

Furthermore, according to the present invention, the robot apparatus is provided with dialogue control means for controlling a dialogue with the user according to a previously given scenario, based on a speech recognition result from speech recognition means that performs speech recognition on the user's utterance, and response generating means for generating an answering sentence corresponding to the contents of the user's utterance in response to a request from the dialogue control means. The dialogue control means requests the response generating means to generate an answering sentence as the occasion demands, based on the contents of the user's utterance.

Consequently, in this robot apparatus, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a perspective view showing the external structure of a robot according to this embodiment.

FIG. 2 is a perspective view showing the external structure of the robot according to this embodiment.

FIG. 3 is a conceptual view for explaining the external structure of the robot according to this embodiment.

FIG. 4 is a conceptual view for explaining the internal structure of the robot according to this embodiment.

FIG. 5 is a block diagram for explaining the internal structure of the robot according to this embodiment.

FIG. 6 is a block diagram for explaining the contents of processing by a main control part relating to dialogue control.

FIG. 7 is a conceptual view for explaining the structure of a scenario.

FIG. 8 is a schematic diagram showing the script format of each block.

FIG. 9 is a schematic diagram showing an example of the program structure of a one-sentence scenario block.

FIG. 10 is a flowchart showing the procedure for reproducing a one-sentence scenario block.

FIG. 11 is a schematic diagram showing an example of the program structure of a question block.

FIG. 12 is a flowchart showing the procedure for reproducing a question block.

FIG. 13 is a schematic diagram showing an example of a semantics definition file.

FIG. 14 is a schematic diagram showing an example of the program structure of a first question/answer block.

FIG. 15 is a flowchart showing the procedure for reproducing a first question/answer block.

FIG. 16 is a schematic diagram showing types of tags to be used in a response generating part.

FIG. 17 is a schematic diagram showing an example of an answering sentence generating rule file.

FIG. 18 is a schematic diagram showing an example of the answering sentence generating rule file.

FIG. 19 is a schematic diagram showing an example of the answering sentence generating rule file.

FIG. 20 is a schematic diagram showing an example of the answering sentence generating rule file.

FIG. 21 is a schematic diagram showing an example of the answering sentence generating rule file.

FIG. 22 is a schematic diagram showing an example of a rule table.

FIG. 23 is a schematic diagram showing an example of the program structure of a second question/answer block.

FIG. 24 is a flowchart showing the procedure for reproducing a second question/answer block.

FIG. 25 is a schematic diagram showing an example of the program structure of a third question/answer block.

FIG. 26 is a flowchart showing the procedure for reproducing a third question/answer block.

FIG. 27 is a schematic diagram showing an example of the program structure of a fourth question/answer block.

FIG. 28 is a flowchart showing the procedure for reproducing a fourth question/answer block.

FIG. 29 is a schematic diagram showing an example of the program structure of a first dialogue block.

FIG. 30 is a schematic diagram showing an example of the program structure of the first dialogue block.

FIG. 31 is a flowchart showing the procedure for reproducing a first dialogue block.

FIG. 32 is a conceptual view showing the list of insertion prompts.

FIG. 33 is a schematic diagram showing an example of the program structure of a second dialogue block.

FIG. 34 is a schematic diagram showing an example of the program structure of the second dialogue block.

FIG. 35 is a flowchart showing the procedure for reproducing a second dialogue block.

FIG. 36 is a flowchart for explaining a dialogue system based on artificial unintelligence.

BEST MODE FOR CARRYING OUT THE INVENTION

An embodiment of the present invention will be described in detail with reference to the accompanying drawings.

(1) General Structure of Robot According to this Embodiment

Referring to FIGS. 1 and 2, reference numeral 1 generally denotes a bipedal robot according to this embodiment. A head unit 3 is disposed on a body unit 2, arm units 4A and 4B having the same structure are disposed on the upper left part and the upper right part of the body unit 2 respectively, and leg units 5A and 5B having the same structure are attached to predetermined positions on the lower left part and the lower right part of the body unit 2 respectively.

In the body unit 2, a frame 10 forming the upper part of the torso and a waist base 11 forming the lower part of the torso are connected via a waist joint mechanism 12. By driving the actuators A₁ and A₂ of the waist joint mechanism 12 fixed to the waist base 11, the upper part of the torso can be turned independently about an orthogonal roll shaft 13 and pitch shaft 14 shown in FIG. 3.

The head unit 3 is attached, via a neck joint mechanism 16, to the top center part of a shoulder base 15 fixed to the upper end of the frame 10. By driving the actuators A₃ and A₄ of the neck joint mechanism 16, the head unit 3 can be turned independently about an orthogonal pitch shaft 17 and yaw shaft 18 shown in FIG. 3.

The arm units 4A and 4B are attached to the left and right ends of the shoulder base 15 via shoulder joint mechanisms 19. By driving the actuators A₅ and A₆ of the corresponding shoulder joint mechanism 19, each of the arm units 4A and 4B can be turned independently about an orthogonal pitch shaft 20 and roll shaft 21 shown in FIG. 3.

In each of the arm units 4A and 4B, an actuator A₈ forming a forearm part is connected to the output shaft of an actuator A₇ forming an upper arm part via an arm joint mechanism 22, and a hand part 23 is attached to the end of the forearm part.

In the arm units 4A and 4B, the forearm parts can be turned about the yaw shafts 24 shown in FIG. 3 by driving the actuators A₇, and about the pitch shafts 25 shown in FIG. 3 by driving the actuators A₈.

On the other hand, the leg units 5A and 5B are attached, via hip joint mechanisms 26, to the waist base 11 forming the lower part of the torso. By driving the actuators A₉ to A₁₁ of the corresponding hip joint mechanism 26, each of the leg units 5A and 5B can be turned independently about a yaw shaft 27, a roll shaft 28, and a pitch shaft 29 that are mutually orthogonal, shown in FIG. 3.

In each of the leg units 5A and 5B, a frame 32 forming an under-thigh part is connected to the lower end of a frame 30 forming a thigh part via a knee joint mechanism 31, and a foot part 34 is connected to the lower end of the frame 32 via an ankle joint mechanism 33.

Thereby, in the leg units 5A and 5B, the under-thigh parts can be turned about the pitch shafts 35 shown in FIG. 3 by driving the actuators A₁₂ forming the knee joint mechanisms 31. Furthermore, the foot parts 34 can be turned independently about an orthogonal pitch shaft 36 and roll shaft 37 shown in FIG. 3 by driving the actuators A₁₃ and A₁₄ of the ankle joint mechanisms 33.

On the back side of the waist base 11 forming the lower part of the torso of the body unit 2, as shown in FIG. 4, a control unit 42 is disposed, in which a main control part 40 for controlling the entire movement of the robot 1, peripheral circuits 41 such as a power supply circuit and a communication circuit, a battery 45 (FIG. 5), and the like are contained in a box.

This control unit 42 is connected to sub control parts 43A to 43D disposed in the respective constituent units (the body unit 2, the head unit 3, the arm units 4A and 4B, and the leg units 5A and 5B). Thereby, the control unit 42 can supply a necessary power supply voltage to these sub control parts 43A to 43D and can communicate with them.

Each of the sub control parts 43A to 43D is connected to the actuators A₁ to A₁₄ in the corresponding constituent unit, so that each of the actuators A₁ to A₁₄ can be driven into a state specified by various control commands given from the main control part 40.

In the head unit 3, as shown in FIG. 5, various external sensors such as a charge coupled device (CCD) camera 50 functioning as the “eyes” of this robot 1, a microphone 51 functioning as the “ears”, and a speaker 52 functioning as the “mouth” are disposed at respective predetermined positions. Touch sensors 53 are disposed on the hand parts 23 and the foot parts 34 as external sensors. Furthermore, the control unit 42 contains internal sensors such as a battery sensor 54 and an acceleration sensor 55.

The CCD camera 50 picks up images of the surroundings and transmits the obtained video signal S1A to the main control part 40. The microphone 51 picks up various external sounds and transmits the obtained audio signal S1B to the main control part 40. Each of the touch sensors 53 detects physical contact with an external object and transmits the detection result to the main control part 40 as a pressure detecting signal S1C.

The battery sensor 54 detects the remaining quantity of the battery 45 at a predetermined cycle and transmits the detection result to the main control part 40 as a remaining-battery detecting signal S2A. The acceleration sensor 55 detects acceleration in the three axis directions (x-axis, y-axis, and z-axis) at a predetermined cycle and transmits the detection result to the main control part 40 as an acceleration detecting signal S2B.

The main control part 40 has the configuration of a microcomputer having a central processing unit (CPU) and an internal memory 40A serving as a read only memory (ROM) and a random access memory (RAM). The main control part 40 determines the surrounding state and the internal state of the robot 1, such as whether an external object has been touched, based on the external sensor signals S1 (the video signal S1A, the audio signal S1B, and the pressure detecting signal S1C supplied respectively from the external sensors such as the CCD camera 50, the microphone 51, and the touch sensors 53) and the internal sensor signals S2 (the remaining-battery detecting signal S2A and the acceleration detecting signal S2B supplied respectively from the internal sensors such as the battery sensor 54 and the acceleration sensor 55).

Then, the main control part 40 determines the next movement based on this determination result, a control program previously stored in the internal memory 40A, and various control parameters stored in an external memory 56 loaded at the time, and transmits a control command based on the determination to the corresponding sub control part 43A-43D. As a result, the corresponding actuator A₁-A₁₄ is driven based on this control command, under the control of that sub control part 43A-43D. In this way, the robot 1 performs movements such as swinging the head unit 3 in all directions, raising the arm units 4A and 4B, and walking.

The main control part 40 recognizes the contents of the user's utterance by performing predetermined speech recognition processing on the audio signal S1B supplied from the microphone 51, and supplies an audio signal S3 corresponding to the recognition result to the speaker 52. Thereby, a synthetic voice for carrying on a dialogue with the user is emitted to the outside.

In this manner, the robot 1 can move autonomously based on the surrounding state and the internal state, and can also make a dialogue with the user.

(2) Processing by Main Control Part 40 Relating to Dialogue Control

(2-1) Contents of Processing by Main Control Part 40 Relating to Dialogue Control

Next, the contents of processing by the main control part 40 relating to dialogue control will be described.

When the contents of processing by the main control part 40 relating to dialogue control in this robot 1 are classified by function, as shown in FIG. 6, they can be classified into a speech recognition part 60 for performing speech recognition on the voice uttered by the user, a scenario reproducing part 62 for controlling a dialogue with the user according to a previously given scenario 61 based on the recognition result from the speech recognition part 60, a response generating part 63 for generating an answering sentence in response to a request from the scenario reproducing part 62, and a voice synthesis part 64 for generating a synthetic voice for one sentence of the scenario 61 reproduced by the scenario reproducing part 62 or for the answering sentence generated by the response generating part 63. Note that, in the description below, “one sentence” means one unit paused in utterance; this “one sentence” is not always a single grammatical sentence.
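Purely as an illustrative sketch of this four-part division, the parts might be modeled as follows; all class and method names, and the stub bodies, are assumptions made for the example, since the actual processing runs as firmware inside the main control part 40.

    # Structural sketch of the four parts of FIG. 6. All names are
    # illustrative; the stubs stand in for real signal processing.

    class SpeechRecognitionPart:
        def recognize(self, audio_signal: bytes) -> str:
            """Would return recognized words (character string data D1)."""
            return "both of them are okay"               # stub

    class ResponseGeneratingPart:
        def generate(self, utterance: str) -> str:
            """Eliza-style answering sentence (character string data D3)."""
            return f"Why do you say '{utterance}'?"

    class VoiceSynthesisPart:
        def speak(self, sentence: str) -> None:
            """Would emit audio signal S3; here it just prints."""
            print("robot> " + sentence)

    class ScenarioReproducingPart:
        """Drives the dialogue along the scenario 61, delegating to the
        response generating part only when the answer is unexpected."""
        def __init__(self, rec, gen, synth):
            self.rec, self.gen, self.synth = rec, gen, synth

        def one_turn(self, prompt: str, expected: set, audio: bytes) -> None:
            self.synth.speak(prompt)            # character string data D2
            answer = self.rec.recognize(audio)  # character string data D1
            if answer not in expected:          # unexpected utterance:
                self.synth.speak(self.gen.generate(answer))  # request COM

    part = ScenarioReproducingPart(SpeechRecognitionPart(),
                                   ResponseGeneratingPart(),
                                   VoiceSynthesisPart())
    part.one_turn("Do you like music?", {"yes", "no"}, b"")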

Here, the speech recognition part 60 has the function of executing predetermined speech recognition processing based on the audio signal S1B supplied from the microphone 51 (FIG. 5) and recognizing the speech included in the audio signal S1B in word units. The speech recognition part 60 supplies the recognized words to the scenario reproducing part 62 as character string data D1.

The scenario reproducing part 62 manages the speech (prompts) that the robot 1 should utter in the course of a series of dialogues with the user, previously given by being stored in the external memory 56 (FIG. 5), by reading the data of plural scenarios 61, each extending over plural turns, from the external memory 56 into the internal memory 40A.

In a dialogue with the user, the scenario reproducing part 62 selects, from these plural scenarios 61, a scenario 61 suited to the user who has been recognized and identified by a face recognition part (not shown) based on the video signal S1A supplied from the CCD camera 50 (FIG. 5) and who becomes the other party of the dialogue, and reproduces that scenario 61. Thereby, character string data D2 corresponding to the voice to be uttered by the robot 1 is sequentially supplied to the voice synthesis part 64.

Furthermore, if the scenario reproducing part 62 confirms, based on the character string data D1 supplied from the speech recognition part 60, that the user gave an unexpected utterance as an answer to a question that the robot 1 asked, the scenario reproducing part 62 supplies the character string data D1 and an answering sentence generation request COM to the response generating part 63.

The response generating part 63 is formed by an artificial unintelligence module that generates an answering sentence by a simple answering-sentence generation algorithm such as the Eliza engine. When the answering sentence generation request COM is supplied from the scenario reproducing part 62, the response generating part 63 generates an answering sentence according to the character string data D1 supplied together with the request COM, and supplies its character string data D3 to the voice synthesis part 64 via the scenario reproducing part 62.

The voice synthesis part 64 generates a synthetic voice based on the character string data D2 supplied from the scenario reproducing part 62 or the character string data D3 supplied from the response generating part 63 via the scenario reproducing part 62, and supplies the obtained audio signal S3 of the synthetic voice to the speaker 52 (FIG. 5). Thus, the synthetic voice based on this audio signal S3 is emitted from the speaker 52.

In this manner, this robot 1 can perform utterance by a combination of “dialogue having no scenario” and “dialogue having scenario”. Thereby, for example, even if the user replies with unexpected words to a question by the robot 1, the robot 1 can respond suitably.

(2-2) Configuration of Scenario 61

(2-2-1) General Configuration of Scenario 61

Next, the configuration of the scenario 61 in this robot 1 will be described.

In the case of this robot 1, as shown in FIG. 7, each scenario 61 is formed by arraying, in arbitrary order, an arbitrary number of blocks BL (BL1-BL8) of plural kinds, each of which provides an action of the robot 1 for one turn in a dialogue, including one sentence that should be uttered by the robot 1.

In the case of this robot 1, there are eight types of blocks BL1-BL8 as the program providing an action for one turn, including the contents of the robot 1's utterance in a dialogue with the user (hereinafter referred to as block BL (BL1-BL8)). Next, the configuration of each of these eight types of blocks BL1-BL8 and the procedure by which the scenario reproducing part 62 reproduces each of them will be described.
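Before going through the individual blocks, the following sketch shows one possible encoding of a scenario as an ordered list of block objects; the class and field names are assumptions for illustration, as FIG. 7 does not fix a concrete data format.

    # Sketch of a scenario 61 as an ordered list of blocks (FIG. 7).
    from dataclasses import dataclass

    @dataclass
    class OneSentenceScenarioBlock:          # block BL1
        sentence: str

    @dataclass
    class QuestionBlock:                     # block BL2
        question: str
        prompt_for_positive: str
        prompt_for_negative: str

    # An arbitrary number of blocks of plural kinds, in arbitrary order.
    scenario = [
        OneSentenceScenarioBlock("Hello, nice to meet you."),
        QuestionBlock("Do you like music?", "Me too!", "That's a pity."),
        OneSentenceScenarioBlock("Let's talk again soon."),
    ]

    # The scenario reproducing part walks the list, one turn per block.
    for block in scenario:
        print(type(block).__name__)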

Note that the “one sentence scenario block BL1” and the “question block BL2” described next already exist, whereas the blocks BL3-BL8 described after them have not existed before and are peculiar to this robot 1.

Furthermore, in FIGS. 9, 11, 14, 23, 25, 27, 29, 30, 33 and 34 below, each script (program configuration) is described according to the rule shown in FIG. 8. In the reproducing processing of each block BL, the scenario reproducing part 62 supplies character string data D2 to the voice synthesis part 64 and gives answering sentence generation requests to the response generating part 63 according to this rule.

(2-2-2) One Sentence Scenario Block BL1

The one sentence scenario block BL1 is a block BL composed of only one sentence in the scenario 61, and has, for example, the program configuration shown in FIG. 9.

When reproducing the one sentence scenario block BL1, according to the procedure for reproducing a one sentence scenario block RT1 shown in FIG. 10, in step SP1 the scenario reproducing part 62 reproduces the one sentence provided by the block maker and supplies its character string data D2 to the voice synthesis part 64. Then the scenario reproducing part 62 stops the reproducing processing of this one sentence scenario block BL1 and proceeds to the reproducing processing of the following block BL.

(2-2-3) Question Block BL2

The question block BL2 is a block BL used when asking the user a question or the like, and has, for example, the program configuration shown in FIG. 11. This question block BL2 urges the user to utter, and the robot 1 utters a prompt for positive or for negative provided by the block maker, according to whether or not the user's answer to the question was positive.

Practically, when reproducing this question block BL2, according to the procedure for reproducing a question block RT2 shown in FIG. 12, first, in step SP10, the scenario reproducing part 62 reproduces the one sentence provided by the block maker and supplies its character string data D2 to the voice synthesis part 64. Then, in the next step SP11, the scenario reproducing part 62 awaits the user's answer (utterance).

When it recognizes, based on the character string data D1 from the speech recognition part 60, that the user has replied, the scenario reproducing part 62 proceeds to step SP12 to determine whether or not the contents of the answer were positive.

If a positive result is obtained in step SP12, the scenario reproducing part 62 proceeds to step SP13 to reproduce the answering sentence for positive, supplies its character string data D2 to the voice synthesis part 64, and stops the reproducing processing of this question block BL2. Then the scenario reproducing part 62 proceeds to the reproducing processing of the following block BL.

On the contrary, if a negative result is obtained in step SP12, the scenario reproducing part 62 proceeds to step SP14 to determine whether or not the user's answer recognized in step SP11 was negative.

If an affirmative result is obtained in step SP14, the scenario reproducing part 62 proceeds to step SP15 to reproduce the answering sentence for negative, supplies its character string data D2 to the voice synthesis part 64, and then stops the reproducing processing of this question block BL2. Then the scenario reproducing part 62 proceeds to the reproducing processing of the following block BL.

On the contrary, if a negative result is obtained in step SP14, the scenario reproducing part 62 stops the reproducing processing of this question block BL2 as it is. Then the scenario reproducing part 62 proceeds to the reproducing processing of the following block BL.

Note that, in the case of this robot 1, as a means for determining whether the user's response was positive or negative, the scenario reproducing part 62 has a semantics definition file such as that shown in FIG. 13.

The scenario reproducing part 62 determines whether the user's answer was positive (“positive”) or negative (“negative”) by referring to this semantics definition file, based on the character string data D1 supplied from the speech recognition part 60.
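The following sketch combines the question block procedure RT2 with such a semantics lookup; the word lists stand in for the semantics definition file of FIG. 13 and are invented for the example.

    # Sketch of reproducing a question block BL2 (RT2) with a semantics
    # definition table mapping recognized words to positive/negative.
    SEMANTICS = {
        "positive": {"yes", "sure", "of course", "yeah"},
        "negative": {"no", "nope", "not really"},
    }

    def classify(answer: str):
        """Return 'positive', 'negative', or None if neither."""
        for meaning, words in SEMANTICS.items():
            if answer.lower().strip() in words:
                return meaning
        return None

    def reproduce_question_block(question, for_positive, for_negative, answer):
        print("robot>", question)            # step SP10: utter the question
        meaning = classify(answer)           # steps SP11-SP12: await and judge
        if meaning == "positive":
            print("robot>", for_positive)    # step SP13
        elif meaning == "negative":
            print("robot>", for_negative)    # steps SP14-SP15
        # neither: the plain question block ends without responding (SP14)

    reproduce_question_block("Do you like music?",
                             "Me too!", "That's a pity.", "yes")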

(2-2-4) First Question/Answer Block BL3 (No Loop)

The first question/answer block BL3 is a block BL used when asking the user a question or the like, similarly to the aforementioned question block BL2, and has, for example, the program configuration shown in FIG. 14. This first question/answer block BL3 is designed so that the robot 1 can respond even if the user's answer to the question or the like was neither positive nor negative.

Practically, when reproducing this first question/answer block BL3, according to the procedure for reproducing a first question/answer block shown in FIG. 15, first, in steps SP20-SP25, the scenario reproducing part 62 performs processing similar to steps SP10-SP14 of the aforementioned procedure for reproducing a question block RT2 (FIG. 12).

If a negative result is obtained in step SP24, the scenario reproducing part 62 supplies an answering sentence generation request COM and a tag denoting the kind of rule to be used to generate the answering sentence (SPECIFIC, GENERAL, LAST, SPECIFIC ST, GENERAL ST, LAST), for example as shown in FIG. 16, to the response generating part 63 (FIG. 6), together with the character string data D1 supplied from the speech recognition part 60 at that time. Note that the tag to be supplied to the response generating part 63 at this time has been determined in advance by the block maker (for example, see the line of node number “1060” in FIG. 14).

The response generating part 63 has plural files in which the generation rules for the corresponding answering sentences are provided, for example as shown in FIGS. 17-21, each corresponding to one kind of generation rule for the answering sentence to be generated. Furthermore, the response generating part 63 has a rule table, shown in FIG. 22, in which these files are related to the tags supplied from the scenario reproducing part 62.

The response generating part 63 refers to this rule table based on the tag supplied from the scenario reproducing part 62, selects the corresponding file, generates an answering sentence according to the corresponding generation rule from the character string data D1 supplied from the speech recognition part 60 at that time, and supplies its character string data D3 to the voice synthesis part 64 via the scenario reproducing part 62.
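A possible sketch of this tag-driven lookup is shown below; the table contents and sentence templates are invented, the real generation rules being those of the files in FIGS. 17-21 and the rule table in FIG. 22.

    # Sketch of resolving a tag to an answering-sentence generation rule
    # via a rule table (cf. FIG. 22). Templates are invented examples.
    import random

    RULE_TABLE = {
        "SPECIFIC": ["So you mean {d1}?", "Why do you say {d1}?"],
        "GENERAL":  ["I see.", "Is that so?"],
        "LAST":     ["Well, let's move on."],
    }

    def generate_answering_sentence(tag: str, d1: str) -> str:
        """Pick a rule for the tag and fill in the recognized utterance D1."""
        template = random.choice(RULE_TABLE[tag])
        return template.format(d1=d1)

    # The tag has been fixed in advance by the block maker (e.g. node "1060").
    print(generate_answering_sentence("SPECIFIC", "both of them are okay"))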

Then the scenario reproducing part 62 stops the reproducing processing of this first question/answer block BL3 and proceeds to the reproducing processing of the following block BL.

(2-2-5) Second Question/Answer Block BL4 (Loop Type 1)

The second question/answer block BL4 is a block BL used when asking the user a question or the like, similarly to the question block BL2, and has, for example, the program configuration shown in FIG. 23. This second question/answer block BL4 is used to prevent the dialogue from becoming unnatural, by taking into account the contents of the answering sentence generated in the response generating part 63 in the case where the user's answer to the question or the like was neither positive nor negative.

Concretely, in step SP26 of the procedure for reproducing a first question/answer block RT3 described above with FIG. 15, in the case where the response generating part 63 generated a request sentence such as “Try to say the same thing in different words.” or a question sentence such as “Is that true?”, if the scenario reproducing part 62 proceeded to the reproducing processing of the next block BL after finishing the processing of step SP26, the user could not answer the request or question, so that the dialogue would become unnatural.

Therefore, this second question/answer block BL4 is designed so that, when the response generating part 63 generates an answering sentence and there is a possibility that the answering sentence is a question sentence to which the user can respond by “yes” or “no”, the user's response to it can be accepted.

Practically, when reproducing this second question/answer block BL4, according to the procedure for reproducing a second question/answer block RT4 shown in FIG. 24, in steps SP30-SP36 the scenario reproducing part 62 performs processing similar to steps SP20-SP26 of the aforementioned procedure for reproducing a first question/answer block RT3.

In step SP36, the scenario reproducing part 62 requests the response generating part 63 to generate an answering sentence. When it receives the character string data D3 for the answering sentence generated by the response generating part 63, the scenario reproducing part 62 supplies it to the voice synthesis part 64, and also determines whether or not the answering sentence is of loop type.

Specifically, the response generating part 63 is designed so that, when supplying the scenario reproducing part 62 with the character string data D3 of the answering sentence generated in response to its request, it adds attribute information to the character string data D3 as follows. If the answering sentence is a question sentence or the like that the user can answer by “yes” or “no”, attribute information showing that the answering sentence is of a first loop type is added. If the answering sentence is a request sentence or the like that the user cannot answer by “yes” or “no”, attribute information showing that the answering sentence is of a second loop type is added. And if the answering sentence is a declarative sentence to which the user need not respond, attribute information showing that the answering sentence is of a no-loop type is added.
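The following sketch illustrates such attribute tagging; the classification heuristics are invented for the example, as the patent defines only the three attribute values themselves.

    # Sketch of attaching a loop-type attribute to each answering sentence.
    from dataclasses import dataclass

    @dataclass
    class AnsweringSentence:
        text: str
        loop_type: str    # "loop1", "loop2", or "noloop"

    YES_NO_STARTERS = ("is ", "are ", "do ", "does ", "did ", "can ")

    def tag_answering_sentence(text: str) -> AnsweringSentence:
        lowered = text.lower()
        if lowered.endswith("?"):
            if lowered.startswith(YES_NO_STARTERS):
                return AnsweringSentence(text, "loop1")  # yes/no question
            return AnsweringSentence(text, "loop2")      # open question
        if lowered.startswith(("try to", "please")):
            return AnsweringSentence(text, "loop2")      # request sentence
        return AnsweringSentence(text, "noloop")         # declarative

    print(tag_answering_sentence("Is that true?"))
    print(tag_answering_sentence("Try to say the same thing in different words."))
    print(tag_answering_sentence("I see."))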

Thus, when reproducing this second question/answer block BL4, in step SP36 of the procedure for reproducing a second question/answer block RT4, based on the attribute information supplied from the response generating part 63 together with the character string data D3 of the answering sentence, if the answering sentence is of the first loop type, the scenario reproducing part 62 returns to step SP31 and thereafter repeats the processing of steps SP31-SP36 until an affirmative result is obtained in step SP37.

When an affirmative result is obtained in step SP37, that is, when the response generating part 63 has generated a no-loop type answering sentence, the scenario reproducing part 62 stops the reproducing processing of this second question/answer block BL4 and proceeds to the reproducing processing of the following block BL.

(2-2-6) Third Question/Answer Block BL5 (Loop Type 2)

The third question/answer block BL5 is a block BL used, similarly to the second question/answer block BL4, to prevent the dialogue from becoming unnatural by taking into account the contents of the answering sentence generated in the response generating part 63 in the case where the user's response to a question or the like was neither positive nor negative, and has, for example, the program configuration shown in FIG. 25.

This third question/answer block BL5 is designed so that, when the response generating part 63 generates as the answering sentence a sentence that the user cannot answer by “yes” or “no”, for example a request sentence such as “Try to say the same thing in different words.” or a question sentence such as “How do you think about that?”, the user's response can be accepted and the robot 1 can respond to it.

Practically, when reproducing this third question/answer block BL5, according to the procedure for reproducing a third question/answer block RT5 shown in FIG. 26, in steps SP40-SP46 the scenario reproducing part 62 performs processing similar to steps SP20-SP26 of the aforementioned procedure for reproducing a first question/answer block RT3 (FIG. 15).

Next, the scenario reproducing part 62 proceeds to step SP47 to determine whether or not the answering sentence based on the character string data D3 is of the aforementioned second loop type, based on the attribute information added to the character string data D3 supplied from the response generating part 63.

If the answering sentence is of the second loop type, the scenario reproducing part 62 returns to step SP46, and thereafter repeats the loop of steps SP46-SP48 until a negative result is obtained in step SP47.

When a negative result is obtained in step SP47, that is, when the response generating part 63 has generated a no-loop type answering sentence, the scenario reproducing part 62 stops the reproducing processing of this third question/answer block BL5 and proceeds to the reproducing processing of the following block BL.

(2-2-7) Fourth Question/Answer Block BL6 (Loop Type 3)

The fourth question/answer block BL6 is a block used, similarly to the second and third question/answer blocks BL4 and BL5, to prevent the dialogue from becoming unnatural by taking into account the contents of the answering sentence generated in the response generating part 63 in the case where the user's response to a question or the like was neither positive nor negative, and has, for example, the program configuration shown in FIG. 27.

This fourth question/answer block BL6 is designed so that the scenario reproducing part 62 can cope with both the case where the answering sentence generated by the response generating part 63 is of the aforementioned first loop type and the case where it is of the second loop type.

Practically, when reproducing this fourth question/answer block BL6, according to the procedure for reproducing a fourth question/answer block RT6 shown in FIG. 28, in steps SP50-SP56 the scenario reproducing part 62 performs processing similar to steps SP20-SP26 of the aforementioned procedure for reproducing a first question/answer block RT3 (FIG. 15).

After the processing of step SP56, the scenario reproducing part 62 proceeds to step SP57 to determine whether the generated answering sentence is of the aforementioned first or second loop type, based on the attribute information added to the character string data D3 supplied from the response generating part 63.

If the answering sentence is of either the first or the second loop type, the scenario reproducing part 62 proceeds to step SP58 to determine whether or not the answering sentence is of the first loop type.

If an affirmative result is obtained in step SP58, the scenario reproducing part 62 returns to step SP51. If a negative result is obtained in step SP58, the scenario reproducing part 62 proceeds to step SP59 to await the user's response. When a response is made, the scenario reproducing part 62 recognizes it based on the character string data D1 from the speech recognition part 60 and then returns to step SP56. Thereafter, the scenario reproducing part 62 repeats the processing of steps SP51-SP59 until a negative result is obtained in step SP57.

When a negative result is obtained in step SP57, that is, when the response generating part 63 has generated a no-loop type answering sentence, the scenario reproducing part 62 stops the reproducing processing of this fourth question/answer block BL6 and proceeds to the reproducing processing of the following block BL.
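A condensed sketch of this procedure RT6 is given below; the helper functions stand in for the speech recognition and response generating parts, and the canned replies are invented for the demonstration.

    # Condensed sketch of the fourth question/answer block loop (FIG. 28).
    from dataclasses import dataclass

    @dataclass
    class AnsweringSentence:
        text: str
        loop_type: str          # "loop1", "loop2", or "noloop"

    def reproduce_fourth_block(question, generate, await_utterance):
        print("robot>", question)            # steps SP50-SP55, condensed
        answer = await_utterance()
        while True:
            reply = generate(answer)         # step SP56: generation request
            print("robot>", reply.text)
            if reply.loop_type == "noloop":  # negative result in step SP57:
                break                        # the block finishes
            # "loop1" re-enters the whole recognition path (back to SP51);
            # "loop2" awaits a reply and regenerates (SP59, then SP56).
            answer = await_utterance()

    replies = iter([AnsweringSentence("Is that true?", "loop1"),
                    AnsweringSentence("I see.", "noloop")])
    answers = iter(["both of them are okay", "yes, really"])
    reproduce_fourth_block("Do you like music?",
                           lambda _a: next(replies),
                           lambda: next(answers))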

(2-2-8) First Dialogue Block BL7 (No Loop)

The first dialogue block BL7 is a block BL used to add an opportunity for the user to give utterance, and has, for example, the program configurations shown in FIGS. 29 and 30. Note that FIG. 29 shows an example of the program configuration in the case where there is a prompt, and FIG. 30 shows an example in the case where there is no prompt.

For example, by placing this first dialogue block BL7 immediately after the one sentence scenario block BL1 described above with FIGS. 9 and 10, the number of turns in the dialogue can be increased, which can give the user a feeling of “making a dialogue”.

Furthermore, when the robot 1 reproduces a word (prompt) such as “I think so.”, “Is it wrong?” or “What do you think?”, the user finds it easier to give utterance. Therefore, this first dialogue block BL7 is designed so that the scenario reproducing part 62 reproduces one such sentence (prompt) before awaiting the user's utterance. However, because this one sentence sometimes becomes unnecessary depending on the contents of the robot 1's utterance in the block BL reproduced immediately before, it is designed to be omittable.

Practically, when reproducing this first dialogue block BL7, according to the procedure for reproducing a first dialogue block RT7 shown in FIG. 31, first, in step SP60, the scenario reproducing part 62 reproduces, as the occasion demands, the omittable prompt provided by the block maker, and then, in the next step SP61, awaits the user's utterance.

When the scenario reproducing part 62 recognizes, based on the character string data D1 from the speech recognition part 60, that the user has uttered, it proceeds to step SP62 to supply the answering sentence generation request COM to the response generating part 63, together with the character string data D1.

As a result, an answering sentence is generated in the response generating part 63 based on the character string data D1 and the answering sentence generation request COM, and its character string data D3 is supplied to the voice synthesis part 64 via the scenario reproducing part 62.

Then the scenario reproducing part 62 stops the reproducing processing of this first dialogue block BL7 and proceeds to the reproducing processing of the following block BL.
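This non-looping behavior can be sketched as follows; the function arguments are placeholders for the parts of FIG. 6.

    # Sketch of the first dialogue block BL7 (RT7): optional prompt, one
    # user utterance, one generated answer, then the block ends (no loop).
    def reproduce_first_dialogue_block(prompt, await_utterance, generate):
        if prompt is not None:                 # step SP60: prompt is omittable
            print("robot>", prompt)
        utterance = await_utterance()          # step SP61: await the user
        print("robot>", generate(utterance))   # step SP62: answering sentence

    reproduce_first_dialogue_block(
        "What do you think?",
        lambda: "I think robots are fun",
        lambda u: f"So you feel that {u}.",
    )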

(2-2-9) Second Dialogue Block BL8 (Loop)

The second dialogue block BL8 is a block BL used, like the first dialogue block BL7, to add an opportunity for the user to give utterance, and has, for example, the program configuration shown in FIG. 33 or 34. Note that FIG. 33 shows an example of the program configuration in the case where there is a prompt, and FIG. 34 shows an example in the case where there is no prompt.

This second dialogue block BL8 is effective in the case where there is a possibility that, in step SP62 of the procedure for reproducing a first dialogue block RT7 described above with FIG. 31, the response generating part 63 generates a question sentence or a request sentence as the answering sentence.

Practically, when reproducing this second dialogue block BL8, according to the procedure for reproducing a second dialogue block RT8 shown in FIG. 35, in steps SP70-SP72 the scenario reproducing part 62 performs processing similar to steps SP60-SP62 of the aforementioned procedure for reproducing a first dialogue block RT7 (FIG. 31).

In the next step SP73, the scenario reproducing part 62 determines whether or not the answering sentence is of the second loop type, based on the aforementioned attribute information added to the character string data D3 supplied from the response generating part 63.

If an affirmative result is obtained in step SP73, the scenario reproducing part 62 returns to step SP71, and thereafter repeats the loop of steps SP71-SP73 until a negative result is obtained in step SP73.

When a negative result is obtained in step SP73, that is, when the response generating part 63 has generated a no-loop type answering sentence, the scenario reproducing part 62 stops the reproducing processing of this second dialogue block BL8 and proceeds to the reproducing processing of the following block BL.

(3) Method for Making Scenario 61

Next, a method for making a scenario 61 by use of the above blocks BL1-BL8 will be described.

As methods for making a scenario 61 using the aforementioned various blocks BL1-BL8, there are a first scenario making method in which a scenario 61 is made completely from the beginning, and a second scenario making method in which a new scenario 61 is made by modifying an existing scenario 61.

In the first scenario making method, as described above with FIG. 7, a desired scenario 61 can be made by aligning an arbitrary number of the eight kinds of blocks BL1-BL8 in series in arbitrary order, and providing a necessary sentence in each block BL according to the preference of the person making the scenario.

Furthermore, in the second scenario making method, a new scenario 61 can be easily made from an existing scenario 61 composed of the aforementioned one sentence scenario blocks BL1 and question blocks BL2 (a sketch of both modifications follows the list below):

[1] by replacing a question block BL2 with one of the first to fourth question/answer blocks BL3-BL6 (or with the first or second dialogue block BL7 or BL8, depending on the contents of the preceding and following blocks BL);

[2] by inserting one or more of the first or second dialogue blocks BL7 or BL8 (or a one sentence scenario block BL1, a question block BL2, or one of the first to fourth question/answer blocks BL3-BL6, depending on the contents of the preceding and following blocks BL) immediately after a one sentence scenario block BL1.
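The following sketch applies both modifications to a toy scenario; the string labels are illustrative stand-ins for actual block objects.

    # Sketch of the second scenario making method: modify an existing
    # scenario of BL1/BL2 blocks into a new one.
    existing_scenario = ["BL1: greeting", "BL2: question", "BL1: closing"]

    new_scenario = list(existing_scenario)
    new_scenario[1] = "BL4: question/answer"   # [1] replace BL2 with BL3-BL6
    new_scenario.insert(1, "BL7: dialogue")    # [2] insert BL7/BL8 after BL1

    print(new_scenario)
    # ['BL1: greeting', 'BL7: dialogue', 'BL4: question/answer', 'BL1: closing']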

(4) Operation and Effects of this Embodiment

According to the above structure, in this robot 1, under the control of the scenario reproducing part 62, “dialogue having scenario” is normally performed with the user according to the scenario 61; on the other hand, in the case where the user gives a response unexpected in the scenario 61 or the like, “dialogue having no scenario” is performed by means of an answering sentence generated in the response generating part 63.

Accordingly, in this robot 1, even if the user gives a response unexpected in the scenario 61, a suitable response can be returned. This effectively prevents the subsequent story from becoming unnatural.

Furthermore, in this robot 1, the scenario 61 can be made by aligning, in arbitrary order, an arbitrary number of blocks BL of plural kinds, each providing an action of the robot 1 for one turn in a dialogue, including one sentence to be uttered by the robot 1. Therefore, making a scenario is easy, and interesting scenarios can also be made easily, with less effort, by using existing scenarios 61.

According to the above structure, under the control of the scenario reproducing part 62, “dialogue having scenario” is normally performed with the user according to the scenario 61; on the other hand, in the case where the user gives a response unexpected in the scenario 61 or the like, “dialogue having no scenario” is performed by means of an answering sentence generated in the response generating part 63. Therefore, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”. Thus, a robot that can make a natural dialogue with the user can be realized.

(5) Other Embodiments

The aforementioned embodiment has dealt with the case where the present invention is applied to the robot 1 configured as in FIGS. 1-5. However, the present invention is not limited to this; it can be widely applied to robot apparatuses having other various configurations, and to various dialogue systems other than robot apparatuses for making a dialogue with human beings.

The aforementioned embodiment has dealt with the case where the aforementioned eight types of blocks BL forming the scenario 61 are prepared. However, the present invention is not limited to this; the scenario 61 may be made with blocks having configurations other than these eight types, or another type of block may be prepared in addition to these eight types.

The aforementioned embodiment has dealt with the case where a single response generating part 63 is used. However, the present invention is not limited to this; for example, dedicated response generating parts may be provided corresponding respectively to the steps that request the response generating part 63 to generate an answering sentence in the blocks BL3-BL8 (steps SP26, SP36, SP46, SP56, SP62 and SP72). Furthermore, two types may be prepared, a response generating part that does not generate question sentences or request sentences and a response generating part that may generate question sentences and request sentences, and they may be used selectively depending on the situation.

The aforementioned embodiment has dealt with the case where the blocks BL2-BL6 are provided with steps for determining whether the user's response is positive or negative (steps SP12, SP14, SP22, SP24, SP32, SP34, SP42, SP44, SP52 and SP54). However, the present invention is not limited to this; steps for matching against other words may be provided instead.

Concretely, for example, it can also be designed so that the robot 1 asks the user a question such as “What prefecture were you born in?” and determines the prefecture corresponding to the speech recognition result of the user's answer.

The aforementioned embodiment has dealt with the case where the number of loop iterations in the blocks BL4-BL6 and BL8 (steps SP37, SP47, SP57 and SP73) is unlimited. However, the present invention is not limited to this; a counter for counting the number of loop iterations may be provided, and the number of iterations may be limited based on the count of this counter.

The aforementioned embodiment has dealt with the case where the waiting time for awaiting the user's utterance is unlimited (for example, step SP11 in the procedure for reproducing a question block RT2). However, the present invention is not limited to this; the waiting time may be limited. For instance, it may be designed so that if the user does not utter within ten seconds after the robot 1 utters, a previously prepared time-out response is reproduced and processing proceeds to the reproducing of the next block BL.
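The following sketch combines the two variations above, a counter-limited loop and a ten-second time-out on awaiting the user's utterance; the select-based console input and the time-out response are assumptions made for the example (POSIX-style standard input is assumed).

    # Sketch of limiting the loop count and the utterance-awaiting time.
    import select
    import sys

    def await_utterance(timeout_s: float = 10.0):
        """Return the user's line, or None if nothing arrives in time."""
        ready, _, _ = select.select([sys.stdin], [], [], timeout_s)
        return sys.stdin.readline().strip() if ready else None

    def reproduce_block_with_limits(question, generate, max_loops: int = 3):
        print("robot>", question)
        for _ in range(max_loops):              # counter-limited loop
            answer = await_utterance(10.0)
            if answer is None:                  # time-out: prepared response,
                print("robot> Well, never mind.")
                return                          # then on to the next block
            reply = generate(answer)
            print("robot>", reply)
            if not reply.endswith("?"):         # a no-loop reply ends the block
                return

    reproduce_block_with_limits("Do you like music?",
                                lambda a: f"I see, {a}.")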

The aforementioned embodiment has dealt with the case where the scenario 61 is formed by aligning the blocks BL in series. However, the present invention is not limited to this; branches may be provided in the scenario 61 by arranging blocks BL in parallel or the like.

The aforementioned embodiment has dealt with the case where the robot 1 uses only voice in a dialogue with the user. However, the present invention is not limited to this; motions (actions) may also be performed in addition to voice.

The aforementioned embodiment has dealt with the case where requests from the user are not accepted. However, the present invention is not limited to this; the scenario 61 may be made so that requests from the user such as “Stop.” and “I beg your pardon?” can be accepted.

The aforementioned embodiment has dealt with the case where the following are combined as shown in FIG. 6: the speech recognition part 60 serving as speech recognition means for performing speech recognition on the user's utterance; the scenario reproducing part 62 serving as dialogue control means for controlling a dialogue with the user according to the previously given scenario 61, based on the speech recognition result from the speech recognition part 60; the response generating part 63 serving as response generating means for generating an answering sentence according to the contents of the user's utterance in response to a request from the scenario reproducing part 62; and the voice synthesis part 64 serving as voice synthesis means for performing voice synthesis processing on one sentence of the scenario 61 reproduced by the scenario reproducing part 62 or on the answering sentence generated by the response generating part 63. However, the present invention is not limited to this; for example, the character string data D3 supplied from the response generating part 63 may be supplied directly to the voice synthesis part 64. Various other combinations of the speech recognition part 60, the scenario reproducing part 62, the response generating part 63 and the voice synthesis part 64 can be widely applied.

According to the present invention as described above, a voice dialogue system is provided with dialogue control means for controlling a dialogue with the user according to a previously given scenario, based on the speech recognition result from speech recognition means that performs speech recognition on the user's utterance, and response generating means for generating an answering sentence according to the contents of the user's utterance in response to a request from the dialogue control means. The dialogue control means requests the response generating means to generate an answering sentence as the occasion demands, based on the contents of the user's utterance. Thereby, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”. Thus, a voice dialogue system capable of making a natural dialogue with the user can be realized.

According to the present invention, a voice dialogue method is provided with a first step of performing speech recognition on the user's utterance, a second step of controlling a dialogue with the user according to a previously given scenario based on the speech recognition result and generating an answering sentence according to the contents of the user's utterance as the occasion demands, and a third step of performing voice synthesis processing on one sentence of the reproduced scenario or on the generated answering sentence. In the second step, an answering sentence according to the contents of the user's utterance is generated as the occasion demands, based on the contents of the user's utterance, so that the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”. Thus, a voice dialogue method by which a natural dialogue can be performed with the user can be realized.

Furthermore, according to the present invention, a robot apparatus is provided with dialogue control means for controlling a dialogue with the user according to a previously given scenario, based on the speech recognition result from speech recognition means that performs speech recognition on the user's utterance, and response generating means for generating an answering sentence according to the contents of the user's utterance in response to a request from the dialogue control means. The dialogue control means requests the response generating means to generate an answering sentence as the occasion demands, based on the contents of the user's utterance. Thereby, the dialogue with the user can be prevented from becoming unnatural, and at the same time the user can be given a feeling of “making a dialogue”. Thus, a robot apparatus capable of making a natural dialogue with the user can be realized.

INDUSTRIAL UTILIZATION

The present invention is widely applicable to various apparatuses having a voice dialogue function, such as personal computers, in addition to entertainment robots.

1. A voice dialogue system comprising: speech recognition means forperforming speech recognition on the user's utterance; dialogue controlmeans for controlling a dialogue with said user according to a scenariopreviously given, based on the speech recognition result by said speechrecognition means; response generating means for generating an answeringsentence corresponding to the contents of said user's utterance,responding to a request from said dialogue control means; and speechsynthesis means for performing speech synthesis processing to onesentence in said scenario reproduced by said dialogue control means orsaid answering sentence generated by said response generating means; andsaid voice dialogue system wherein, said dialogue control means requestssaid response generating means to generate said answering sentence asthe occasion demands, based on the contents of said user's utterance. 2.The voice dialogue system according to claim 1, wherein; said dialoguecontrol means controls said dialogue with said user based on theattribute of said answering sentence generated by said responsegenerating means.
3. The voice dialogue system according to claim 1, wherein said scenario is made by combining, in an arbitrary order, an arbitrary number of plural types of blocks, each in a predetermined format providing for one turn of a dialogue with said user.
4. The voice dialogue system according to claim 3, comprising, as one of said blocks, a first block having: a first reproducing step for reproducing said one sentence to urge said user to utter; a first utterance await and recognition step for awaiting said user's utterance after said first reproducing step and, when said user utters, recognizing the contents of the utterance; and a second reproducing step, following said first utterance await and recognition step, for reproducing a corresponding one sentence previously provided, depending on whether the contents of the utterance are positive or negative.

5. The voice dialogue system according to claim 4, comprising, as one of said blocks, a second block having a first answering sentence generation request step for, when the contents of said user's utterance recognized in said first utterance await and recognition step are neither said positive nor said negative, requesting said response generating means to generate said answering sentence corresponding to said contents of said user's utterance.
6. The voice dialogue system according to claim 5, comprising, as one of said blocks, a third block having a first loop in which, if the attribute of said answering sentence generated by said response generating means in response to said request in said first answering sentence generation request step is a first loop type, the processing returns to said first utterance await and recognition step.
7. The voice dialogue system according to claim 5, comprising, as one of said blocks, a fourth block having a second loop in which, if the attribute of said answering sentence generated by said response generating means in response to said request in said first answering sentence generation request step is a second loop type, the processing awaits said user's utterance and, when said user utters, recognizes the contents of the utterance and then returns to said first answering sentence generation request step.
8. The voice dialogue system according to claim 5, comprising, as one of said blocks, a fifth block having: a determination step for determining the attribute of said answering sentence generated by said response generating means in response to said request in said first answering sentence generation request step; a first loop in which, if said attribute determined in said determination step is the first loop type, the processing returns to said first utterance await and recognition step; and a second loop in which, if said attribute determined in said determination step is the second loop type, the processing awaits said user's utterance and, when said user utters, recognizes the contents of the utterance and then returns to said first answering sentence generation request step.
9. The voice dialogue system according to claim 3, comprising, as one of said blocks, a sixth block having: a second reproducing step for reproducing said one sentence, omittable in said scenario if needed; a second utterance await and recognition step for awaiting said user's utterance after said second reproducing step and, when said user utters, recognizing the contents of the utterance; and a second answering sentence generation request step, following said second utterance await and recognition step, for requesting said response generating means to generate said answering sentence corresponding to said contents of said user's utterance.
10. The voice dialogue system according to claim 9, comprising, as one of said blocks, a seventh block having a third loop in which, if the attribute of said answering sentence generated by said response generating means in response to said request in said second answering sentence generation request step is a third loop type, the processing returns to said second utterance await and recognition step.
11. A voice dialogue method comprising: a first step of performing speech recognition on the user's utterance; a second step of controlling a dialogue with said user according to a previously given scenario, based on the result of said speech recognition, and generating, if needed, an answering sentence corresponding to the contents of said user's utterance; and a third step of performing speech synthesis processing on one sentence in said reproduced scenario or on said generated answering sentence; wherein, in said second step, said answering sentence corresponding to the contents of said user's utterance is generated as the occasion demands, based on the contents of said user's utterance.
12. The voice dialogue method according to claim 11, wherein, in said second step, said dialogue with said user is controlled based on the attribute of said generated answering sentence.
13. The voice dialogue method according to claim 11, wherein said scenario is made by combining, in an arbitrary order, an arbitrary number of plural types of blocks, each in a predetermined format providing for one turn of a dialogue with said user.
14. The voice dialogue method according to claim 13, comprising, as one of said blocks, a first block having: a first reproducing step for reproducing said one sentence to urge said user to utter; a first utterance await and recognition step for awaiting said user's utterance after said first reproducing step and, when said user utters, recognizing the contents of the utterance; and a second reproducing step, following said first utterance await and recognition step, for reproducing a corresponding one sentence previously provided, depending on whether the contents of the utterance are positive or negative.
15. The voice dialogue method according to claim 14, comprising, as one of said blocks, a second block having a first answering sentence generating step for, when the contents of said user's utterance recognized in said first utterance await and recognition step are neither said positive nor said negative, generating said answering sentence corresponding to said contents of said user's utterance.
16. The voice dialogue method according to claim 15, comprising, as one of said blocks, a third block having a first loop in which, if the attribute of said answering sentence generated in said first answering sentence generating step is a first loop type, the processing returns to said first utterance await and recognition step.
17. The voice dialogue method according to claim 15, comprising, as one of said blocks, a fourth block having a second loop in which, if the attribute of said answering sentence generated in said first answering sentence generating step is a second loop type, the processing awaits said user's utterance and, when said user utters, recognizes the contents of the utterance and then returns to said first answering sentence generating step.
18. The voice dialogue method according to claim 15, comprising, as one of said blocks, a fifth block having: a determination step for determining the attribute of said answering sentence generated in said first answering sentence generating step; a first loop in which, if said attribute determined in said determination step is the first loop type, the processing returns to said first utterance await and recognition step; and a second loop in which, if said attribute determined in said determination step is the second loop type, the processing awaits said user's utterance and, when said user utters, recognizes the contents of the utterance and then returns to said first answering sentence generating step.
19. The voice dialogue method according to claim 13, comprising, as one of said blocks, a sixth block having: a second reproducing step for reproducing said one sentence, omittable in said scenario if needed; a second utterance await and recognition step for awaiting said user's utterance after said second reproducing step and, when said user utters, recognizing the contents of the utterance; and a second answering sentence generating step, following said second utterance await and recognition step, for generating said answering sentence corresponding to said contents of said user's utterance.

20. The voice dialogue method according to claim 19, comprising, as one of said blocks, a seventh block having a third loop in which, if the attribute of said answering sentence generated in said second answering sentence generating step is a third loop type, the processing returns to said second utterance await and recognition step.
21. A robot apparatus comprising: speech recognition means for performing speech recognition on the user's utterance; dialogue control means for controlling a dialogue with said user according to a previously given scenario, based on the speech recognition result by said speech recognition means; response generating means for generating an answering sentence corresponding to the contents of said user's utterance, responding to a request from said dialogue control means; and speech synthesis means for performing speech synthesis processing on one sentence in said scenario reproduced by said dialogue control means or on said answering sentence generated by said response generating means; wherein said dialogue control means requests said response generating means to generate said answering sentence as the occasion demands, based on the contents of said user's utterance.
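As a non-normative illustration of the block structure recited in claims 3 to 10, the fragment below sketches one turn shaped like the first block together with the loop-type attributes. Every identifier (LoopType, generate_answer, run_first_block, speak, listen) is hypothetical, and the positive/negative vocabulary is invented for the example; a real scenario would define its own.

```python
# Non-normative sketch of the claimed block/loop structure
# (all names hypothetical).

from enum import Enum, auto

class LoopType(Enum):
    FIRST = auto()   # return to the utterance await and recognition step
    SECOND = auto()  # await a new utterance, then request generation again
    NONE = auto()    # leave the block and move on in the scenario

def generate_answer(user_text: str) -> tuple[str, LoopType]:
    """Stand-in for the response generating means: an answering sentence
    plus its attribute (kept at NONE here so the example terminates)."""
    return f"Is that so... '{user_text}'. Anyway:", LoopType.NONE

def run_first_block(speak, listen) -> None:
    """One turn shaped like claims 4 to 8: urge an utterance, branch on a
    positive or negative answer, otherwise generate answers and loop
    according to their attribute."""
    speak("Do you like music?")              # first reproducing step
    while True:
        text = listen()                      # first utterance await and
                                             # recognition step
        if text in ("yes", "yeah"):
            speak("Me too!")                 # second reproducing step (positive)
            return
        if text in ("no", "nope"):
            speak("That's a pity.")          # second reproducing step (negative)
            return
        # Neither positive nor negative: the generation request step.
        answer, loop = generate_answer(text)
        speak(answer)
        while loop is LoopType.SECOND:       # second loop: await an utterance,
            answer, loop = generate_answer(listen())   # then generate again
            speak(answer)
        if loop is not LoopType.FIRST:       # first loop re-enters the await
            return                           # step at the top of this loop

run_first_block(speak=print, listen=lambda: "I prefer silence")
```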